June 1945


THE BIOMETRICS SECTION, AMERICAN STATISTICAL ASSOCIATION 


STATISTICAL METHOD IN FORESTRY 


F. X. SCHUMACHER
School of Forestry, Duke University 


The profession of forestry in the United
States is dedicated to the development and sustained
use of forestlands, comprising about
one-third of our land area, and their resources
of forest, forage, soil and water. Management
of forest properties of the order of
tens, or hundreds of thousands of acres generates
problems involving inventory and growth
of standing timber, utilization of forest products,
control of fire, soil erosion and the depredations
of disease and insects, methods of
harvesting timber so as to insure natural reproduction
or the conservation of water and afforestation.


The solution of most of these implies application
of the statistical method; not always,
perhaps, in such refined form as to gratify the
professional statistician, but usually with sufficient
effectiveness for the particular purpose.


One of the common activities of the practicing
forester is the taking of a sample inventory
or “cruise” of standing timber before sale
or purchase. In this work he has, for generations
indeed, been making use of the method
of double sampling, a statistical technique
commonly believed to be very modern. The
forester has found it time-consuming and expensive
to measure the volume of all the
individual trees, say in board feet or in cords,
on his sample of plots within the area under
consideration. In consequence, he may merely
tally the frequency distribution of tree diameters
on all the sample plots, a comparatively
quick operation for the number of trees involved;
while the average tree volume in each
diameter class, derived from a free-hand curve
of volume on diameter, is based upon an independent
sample of volume measurements
on 50 or more individual sample trees.
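The arithmetic of this cruise can be sketched briefly. The example below is only an illustration under stated assumptions: the diameter classes, tree counts, volumes per tree, number of plots, and the fifth-acre plot size are all invented for the sketch, and the function name is hypothetical.

```python
# Hypothetical illustration of the forester's double sampling.
# Every number below is invented for the sketch; none comes from the article.

def volume_per_plot(tally, mean_volume_by_class, n_plots):
    """Average volume per plot from a diameter tally and a volume table.

    tally: {diameter class (inches): trees counted on all sample plots}
    mean_volume_by_class: {diameter class: mean volume per tree (board feet)},
        read from a volume-on-diameter curve built from separately
        measured sample trees
    n_plots: number of sample plots covered by the tally
    """
    total = sum(count * mean_volume_by_class[d] for d, count in tally.items())
    return total / n_plots

tally = {10: 40, 12: 30, 14: 20, 16: 10}                   # large, cheap sample
mean_volume = {10: 40.0, 12: 70.0, 14: 110.0, 16: 160.0}   # small, costly sample

per_plot = volume_per_plot(tally, mean_volume, n_plots=20)
per_acre = per_plot * 5   # assumes fifth-acre plots, i.e. five plots to the acre
print(per_plot, per_acre)
```

The costly measurement (volume) is taken on few trees, the cheap one (diameter) on many, and the two are combined only at the last step.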


Only an insignificant proportion of such 
routine sampling jobs provides for randomiza- 
tion and the reckoning of sampling error. The 
practicing forester does not take readily to 
randomization of sampling in the field work of 
the forest inventory; for the timber cruise 
is practically always a joint project with for- 
est mapping—of cover-type areas at least, and 
often of age-class areas and topography as 
well—for which random sampling is wholly 
inadequate. Furthermore, in the administra- 
tion of some thousands of acres, even a very 
precise estimate of the total volume of stand- 
ing timber thereon is not in itself sufficiently 
useful information. An additional requirement 
is that its distribution be given according to 
species groups within fairly small subdivisions 
(such as 40-acre compartments) of the prop- 
erty. 


Systematic sampling conforms with these 
needs of forest management. Tree measure- 
ments are taken on parallel strips, usually one 
chain wide, or on lines of sample plots. They 
cover, perhaps, 1 to 5 percent of the area 
if the population sampled extends over several 
thousand acres, and 10 to 20 percent if over 
several hundred. 
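The relation between strip width, strip spacing, and percent of area covered is simple arithmetic. The sketch below assumes parallel strips of uniform width running the full length of the tract, so that the fraction sampled equals strip width divided by the distance between strip center lines. The function name is illustrative.

```python
def strip_spacing_chains(strip_width_chains, percent_cruise):
    """Distance between strip center lines, in chains, for a given percent cruise."""
    return strip_width_chains * 100 / percent_cruise

# One-chain strips at the cruise intensities mentioned in the text.
for percent in (1, 5, 10, 20):
    print(percent, strip_spacing_chains(1, percent))
```

Under these assumptions a 5 percent cruise with one-chain strips places the strips 20 chains (a quarter mile) apart.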


Although representative random sampling 
of the forest inventory has been amply dem- 
onstrated (17), the practicing forester retains 
the conviction, based on long experience, that 
systematic sampling is more representative than 
representative random sampling as the latter 
term is understood by the statistician. This 
predisposition to systematic sampling led Osborne
(15) to study the comparative precision
of systematic and random sampling of cover- 
type areas, and Hasel (9) to an analogous 
study of timber-volume samplings. Each con- 
cluded that the forester’s persuasion is well- 
founded in fact. 


Their findings have gratified those for- 
esters of the U. S. Forest Service, charged with 
the responsibility for the nation-wide forest 
survey, initiated about 1930. In the Lake 
States, South Atlantic, and Gulf States the 
forest survey had relied upon field sampling 
of a systematic pattern in which each “release 
unit” of several million acres was traversed with 
a system of parallel lines, 10 miles apart. At 
10-chain distances along the lines, the required 
data were taken, usually on fifth-acre sample 
plots. There have been no serious complaints 
regarding either the cost or the reliability of 
the survey estimates of forest areas, volume or 
growth of standing timber, according to the 
more important cover types. 


When required to sample less familiar 
populations than areas of standing timber, the 
forester raises little or no objection to the use 
of random sampling. Thus the erosion control 
survey of the Tennessee River watershed (19, 
20) consisted of data taken according to a rep- 
resentative random sampling design. The wa- 
tershed of 26 million acres was divided into 
324 blocks of equal area, and a random sample 
of two 1,100-acre plots was drawn from each 
block. The sampling scheme was practical 
and efficient because excellent planimetric 
maps showing forested and non-forested areas 
had already been completed by the engineers 
of the Tennessee Valley Authority. Coverage 
was about 2.7 percent; and the standard errors 
about 8 percent, at a cost of only $4,300 for 
the field work. 
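The coverage figure follows directly from the design, and the draw itself is easy to sketch. In the example below the acreages and counts are those given in the text; the division of each block into equal candidate plots, the seed, and all variable names are assumptions made for illustration.

```python
import random

WATERSHED_ACRES = 26_000_000
N_BLOCKS = 324
PLOTS_PER_BLOCK = 2
PLOT_ACRES = 1_100

sampled_acres = N_BLOCKS * PLOTS_PER_BLOCK * PLOT_ACRES
coverage_percent = 100 * sampled_acres / WATERSHED_ACRES

# Assumption: each block is cut into equal candidate plots of 1,100 acres.
candidates_per_block = WATERSHED_ACRES // N_BLOCKS // PLOT_ACRES

rng = random.Random(0)  # fixed seed so the draw can be repeated
# Two distinct candidate plots drawn at random, independently in every block.
design = [rng.sample(range(candidates_per_block), PLOTS_PER_BLOCK)
          for _ in range(N_BLOCKS)]

print(sampled_acres, round(coverage_percent, 2), candidates_per_block)
```

Because every block contributes the same number of plots, the sample is spread over the whole watershed while the choice within each block remains random.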


Another illustration is afforded by the
annual sample census of U. S. lumber production
by the Bureau of the Census and the
Forest Service. In the twelve western states
no sampling is involved since the population 
of sawmills—relatively few and mostly large— 
is completely canvassed. But in the remain- 
ing States, some 36 thousand sawmills were 
reported active in 1942. As their production 
varied greatly, the population of mills in each 
state was divided into five production classes 
in thousands of board feet as follows: 1-49, rep- 
resented by 10,189 mills; 50-499, by 16,125 
mills; 500-999, by 4,560 mills; 1,000-4,999, by 
4,756 mills; 5,000+, by 538 mills. Since
the standard deviation of production among
mills within classes varied widely from one
production class to another, sampling was not proportional
to the number of mills. Instead the
sample for each production class consisted of 
2, 14, 22, 66 and 100 percent, respectively, of 
the number of sawmills therein. 
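The sample sizes implied by these rates can be worked out as follows. The mill counts and sampling percentages are those quoted above; the rounding to whole mills, the expansion factors, and the function `estimated_total` are illustrative assumptions, not a description of the Census procedure.

```python
# Active mills in 1942 by production class (thousand board feet), from the text.
mills = {"1-49": 10_189, "50-499": 16_125, "500-999": 4_560,
         "1,000-4,999": 4_756, "5,000+": 538}
# Percent of mills sampled in each class, from the text.
percent_sampled = {"1-49": 2, "50-499": 14, "500-999": 22,
                   "1,000-4,999": 66, "5,000+": 100}

# Assumption: sample sizes are rounded to whole mills.
sample_size = {c: round(mills[c] * percent_sampled[c] / 100) for c in mills}
# Each sampled mill stands for this many mills of its class.
expansion = {c: mills[c] / sample_size[c] for c in mills}

def estimated_total(sample_mean_by_class):
    """Stratified estimate of total production: sum over classes of N_h * mean_h."""
    return sum(mills[c] * sample_mean_by_class[c] for c in mills)

total_mills = sum(mills.values())
overall_percent = 100 * sum(sample_size.values()) / total_mills
print(total_mills, sample_size, round(overall_percent, 1))
```

The largest class is canvassed completely, so its expansion factor is 1, while each sampled mill in the smallest class represents roughly fifty.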


Investigations on the effects of grazing by 
livestock upon forage production and utiliza- 
tion in the western National Forests depend 
upon accurate sampling appraisals. Forage 
varies considerably in the weight of plant ma- 
terial produced by each species in a highly 
variable population. Complete reliance upon 
direct sampling of weight on sufficient num- 
bers of clipped plots is too laborious and ex- 
pensive. With practice, on the other hand, 
weight of forage can be estimated rapidly and 
fairly accurately by eye. Hence the clue to 
efficient sampling of forage values lies in ad- 
justing eye-estimates of weight to weight by a 
scale-balance by double sampling (21). The 
regression equation of measured weight on the 
corresponding eye-estimate is derived from the 
data of relatively few plots for which both 
values are taken. Upon inserting into the 
equation the average eye-estimate of weight 
over the aggregate of all plots of a given 
observer, the adjusted average weight is readily
calculated.
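A minimal sketch of this adjustment is given below, assuming a straight-line regression fitted by least squares. The plot weights, the mean eye-estimate over all plots, and the function name `fit_line` are hypothetical values and names chosen only to show the steps.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical small sample: plots both estimated by eye and clipped and weighed.
eye_estimate = [10.0, 20.0, 30.0, 40.0]
clipped_weight = [12.0, 21.0, 33.0, 42.0]
intercept, slope = fit_line(eye_estimate, clipped_weight)

# Hypothetical mean eye-estimate over ALL plots rated by the same observer.
mean_eye_all_plots = 28.0
adjusted_mean_weight = intercept + slope * mean_eye_all_plots
print(round(intercept, 2), round(slope, 2), round(adjusted_mean_weight, 2))
```

A separate equation would be fitted for each observer, since the text applies the adjustment to "all plots of a given observer."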


Practical methods for sampling popula- 
tions of depredating forest insects, and of fur 
and game animals, have not been completely 
achieved although progress has been shown 
(7, 16).


The statistical method is indispensable in
experimental forestry. Ever since the establishment
of the eleven regional forest and
range experiment stations about 1926, programs
of in-service training in statistical method have
been provided by the Forest Service and by 
the Graduate School of the Department of Ag- 
riculture. Perhaps the most notable of these 
was the course of lectures at Asheville, N. C., 
by Professor R. A. Fisher in 1936 on the de- 
sign of experiments, attended by a group 
of forty foresters from the experiment stations 
and forestry schools. Fisher’s appetite for 
practical problems, his ready comprehension 
of intricate detail, and his helpful advice on 
modes of solution were a stimulating experi- 
ence to men somewhat accustomed to consider 
research in forestry as something unique. 


Experiments in cutting for sustained tim- 
ber yield, or for maximum water yield and 
control of erosion in watershed management, 
have been successfully and efficiently pursued 
through replication in randomized blocks (14). 
Treatments are commonly the method of over- 
story cutting, of grazing, of ground prepara- 
tion, or the use of fire as a silvicultural tool. 
Imposed at two or three levels, they fit well into 
factorial designs. A singularity of this class 
of problem is the large plot-size of five acres 
or so, such that a single block may cover 40 
acres or more. They lend themselves well, 
however, to the study of subsidiary effects— 
such as may be ascribable to methods of brush 
disposal—by split-plot arrangements. 
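A layout of this kind can be sketched in a few lines. The sketch assumes three hypothetical treatments (overstory cutting, grazing, and fire), each at two levels, which gives eight combinations; with five-acre plots, one complete replicate then fills a 40-acre block. The factor names, level labels, number of blocks, and random seed are illustrative assumptions, not details of any experiment cited here.

```python
import itertools
import random

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "cutting": ["light", "heavy"],
    "grazing": ["none", "grazed"],
    "fire": ["unburned", "burned"],
}

def randomized_block_layout(factors, n_blocks, seed):
    """Return one independently randomized plot order per block."""
    combos = list(itertools.product(*factors.values()))
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = combos[:]      # every block receives every treatment combination
        rng.shuffle(order)     # random assignment of combinations to plots
        layout.append(order)
    return layout

layout = randomized_block_layout(factors, n_blocks=4, seed=1945)
plots_per_block = len(layout[0])
block_acres = plots_per_block * 5   # five-acre plots, as in the text
print(plots_per_block, block_acres)
```

A split-plot arrangement would go one step further and divide each plot into sub-plots for the subsidiary treatment; that step is not shown here.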


Sustained maximum timber yield is the 
objective of annual or periodic harvesting of 
mature trees whether singly or in groups. The 
residual stand must be equitably distributed 
among the younger age classes—including the 
new natural reproduction—for best develop- 
ment. Successful natural reproduction is the 
test of silviculture. It depends, in the first 
place, upon the adequacy and dispersal of the 
seed supply in both harvested and residual tim- 
ber; and in the second place, upon the suita- 
bility of the forest floor after harvesting or 
treatment, as a seedbed for germination and
growth of seedlings. The factors of seed sup- 
ply and its dispersal in natural reproduction 
are estimated quantitatively by sampling with 
seed-traps (10). The net result of supply,
dispersion and establishment can be investigated
through regression analysis (12) in
which the dependent variate, the proportion
of mil-acre quadrats stocked with one or more
seedlings, is transformed to probits (1) and
plotted against the average logarithm of the
number of seedlings to the quadrat among the
plots of the timber-cutting experiment.
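The transformation and the fitted line can be sketched as follows. The sketch takes the probit in its classical form (the normal equivalent deviate plus 5) and fits an unweighted least-squares line, which is a simplification; the plot data and function names are hypothetical.

```python
from statistics import NormalDist

def probit(p):
    """Classical probit: normal equivalent deviate of p, plus 5."""
    return NormalDist().inv_cdf(p) + 5.0

def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical plots: mean log10 seedlings per quadrat and proportion stocked.
mean_log_seedlings = [-0.5, 0.0, 0.5, 1.0]
proportion_stocked = [0.20, 0.45, 0.70, 0.90]

probits = [probit(p) for p in proportion_stocked]
intercept, slope = fit_line(mean_log_seedlings, probits)
print(round(intercept, 2), round(slope, 2))
```

Proportions of exactly 0 or 1 have no finite probit, so plots with no stocked quadrats, or with all quadrats stocked, would need separate handling.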


Standard designs in randomized blocks 
or Latin squares are used regularly in experiments
on artificial reproduction. These include
tests of the effects of fertilizer or other
soil treatments on the quality of nursery stock, 
or of the quality of nursery stock on the sur- 
vival rate and early growth of plantations. 
Studies in afforestation, however, can become 
unusually complex. Certain classes of “treat- 
ment” cannot be imposed at will on a given 
experimental area, but require preparatory ex- 
ploration for potential block situations where 
they already prevail. Confounding of certain
effects is the inevitable consequence. Thus
in his search for practical forestation methods
of converting low-value Ozark oak forests to
mixtures of oak and shortleaf pine, Chapman
(4) was forced to confound two very pertinent 
factors of environment—namely, density of oak 
overwood and quality of site—with the blocks 
of his experiment. In this case however, the 
loss of useful information was trivial since 
the chief interest lay not in the main effects 
of these factors, but in their more precisely 
evaluated interactions with treatments imposed 
within the blocks; such as the interaction of 
density of overwood with root-pruning, and 
age, of the planted stock. 


The need for efficient statistical tools in 
forestry had first been felt in correlation work 
involving the graphic expression of timber-tree 
volume in terms of tree-size, and the growth 
of even-aged stands. Such relationships are 
invariably curvilinear and they are usually 
solved by graphical analysis (13). Theoreti- 
cal regression equations and appropriate 
weights for observational data are still in the 
process of development. In the meantime, 
however, a graphic control of free-hand curves 
of a dependent variate on each of two or more 
independent variates is provided by the aline- 
ment chart as developed by Bruce and Reineke 
(2). Initially the relation among the variates 
is approximated graphically or by a linear 
regression equation. Through successive ad- 
justment of its scales the chart is transformed 
to the final graphic expression of the relationship
in question. Even when the form of
regression equation is acceptable theoretically
the alinement chart serves as an excellent
graphic test of linearity (3, Sec. 151).


Investigations dealing with forest mensura- 
tion now consistently make use of regression 
analysis. Up to about fifteen years ago em- 
phasis was placed on yield studies (growth 
curves) of even-aged stands of the more im- 
portant timber types. Recent emphasis has 
been on quantitative definition of degree of 
stocking (5, 6) and the rate at which full 
stocking of timber stands is approached; to 
sample scaling (11) and the analysis of lum- 
ber volume according to tree species and lum- 
ber grade as manufactured from sawlogs of 
various sizes (18). Regression analysis has 
also been useful in elucidating the effect of fire
on gum yield (for turpentine, rosin, etc.) of
longleaf and slash pine (8); and in describing
the effect of rainfall distribution on the growth
of timber trees and forage grasses.


While practitioners are the clientele of the
research forester, they are busy and practical
men and have little interest in the statistical
method as such. It therefore devolves upon the
research forester to present the results of investigation
or experiment in simple and attractive
terms so that their import can be grasped
readily. Hence the statistical method is seldom
mentioned by the authors, and not always recognized
by the readers, of the 10 or 15 percent
of the papers which are based upon its application
among those published in the professional
Journal of Forestry. To the practicing
forester this is as it should be.


(1) Bliss, C. I. The calculation of the dosage-mortality curve. Ann. App. Bio. 22:134-167. 1935.
(2) Bruce, D., and L. H. Reineke. Correlation alinement charts in forest research. U.S.D.A. Tech. Bull. No. [illegible]. 1931.
(3) Bruce, D. and F. X. Schumacher. Forest mensuration. McGraw-Hill Book Co. 2nd ed. 425 pp. 1942.
(4) Chapman, A. G. Classes of shortleaf pine nursery stock for planting in the Missouri Ozarks. Jour. For. 42:818-826. 1944.
(5) Chisman, H. H. and F. X. Schumacher. On the tree-area ratio and certain of its applications. Jour. For. 38:311-317. 1940.
(6) Gevorkiantz, S. R. Measuring stand normality. Jour. For. 42:503-508. 1944.
(7) Green, R. G. and C. A. Evans. Studies on a population cycle of snowshoe hares on the Lake Alexander area. I. Gross annual censuses, 1932-1939. [Journal citation illegible.]
(8) [Author illegible.] Effects of fire on gum yields of longleaf and slash pines. U.S.D.A. Circular [number illegible].
(9) Hasel, A. A. Sampling error in timber surveys. Jour. Agri. Res. 57:713-736. 1938.
(10) Jemison, G. M. and C. F. Korstian. Loblolly pine seed production and dispersal. Jour. For. 42:734-741. 1944.
(11) [Author illegible.] Sale of stumpage on the basis of tree measurement. Jour. For. 40:845-853.
(12) Lynch, D. W. and F. X. Schumacher. Concerning the dispersion of natural reproduction. Jour. For. 39:49-[illegible]. 1941.
(13) Meyer, W. H. Yield of even-aged stands of ponderosa pine. U.S.D.A. Tech. Bull. No. 630.
(14) Niederhof, C. H. and H. G. Wilm. Effect of cutting mature lodgepole-pine stands on rainfall interception. Jour. For. 41:57-61. 1943.
(15) Osborne, J. Sampling errors of systematic and random surveys of cover-type areas. Jour. Am. Stat. Assn. 37:256-264. 1942.
(16) Prebble, M. L. Sampling methods in population studies of the European spruce sawfly in eastern Canada. Transactions of the Royal Society of Canada. Section V. 37:93-126. 1943.
(17) Schumacher, F. X. and R. A. Chapman. Sampling methods in forestry and range management. Duke Univ. School of For. Bull. [remainder illegible].
(18) Schumacher, F. X. and H. E. Young. Empirical log rules according to species groups and lumber grades. Jour. For. 41:511-518. 1943.
(19) Seigworth, K. J. and J. E. Snyder. Further notes on work unit erosion control surveys. Jour. For. 39:[pages illegible]. 1941.
(20) Seigworth, K. J. and J. E. Snyder. The erosion control job in the Tennessee Watershed. Jour. For. 41:442-443. 1943.
(21) Wilm, H. G., D. F. Costello and G. E. Klipple. Estimating forage yield by the double-sampling method. Jour. Am. Soc. Agron. 36:194-203. 1944.


Application for entry as second-class matter is pending. The Biometrics Bulletin is published
six times a year (in February, April, June, August, October and December) by the
American Statistical Association for its Biometrics Section. Editorial Office: 1603 K Street,
N. W., Washington 6, D. C.


Membership dues in the Biometrics Section and the American Statistical Association combined
are $6.00 per year, including the JOURNAL OF THE AMERICAN STATISTICAL
ASSOCIATION, the BIOMETRICS BULLETIN, and the ASA BULLETIN; for Associate
Members of the Section dues are $2.00 per year, which includes the BIOMETRICS BULLETIN.
Single copies are 60 cents and annual subscriptions, $3.00. Subscriptions and applications for
membership should be sent to the American Statistical Association, 1603 K Street, N. W.,
Washington 6, D. C.



TEACHING STATISTICS AT THE DEPARTMENT OF AGRICULTURE 
GRADUATE SCHOOL IN WASHINGTON 


The Graduate School in the Department 
of Agriculture occupies a unique position in 
education. The faculty is composed almost en- 
tirely of professional men in the federal service. 
Many of these men have at some time been
instructors in a university; in addition, many
have had experience in industry. In their daily
work they are faced with the problems of the
practice of their professions; thus they carry to
the classroom healthy enthusiasm and vigor,
and a balance between theory and administration
that is rarely duplicated.


The sponsors of the Graduate School have 
in mind a function of importance outside the 
scope of the great universities of the country. 
The School gives no degrees. Rather, the aim 
is to supplement the education of men and 
women in the federal service who for the most 
part already have degrees but who now, in their 
maturity, foresee the need of other training. 
Classes are held after official hours, not to 
interfere with official duties. 


Among the professions represented in 
Washington probably none rates higher than 
the statisticians’, and perhaps no profession 
was ever so concentrated in one city. During 
the past few years there has been increasing 
recognition and dependence among other pro- 
fessional groups and management in the service 
of both government and industry on current 
quantitative information which the trained stat- 
istician alone is capable of obtaining and eval- 
uating. In the economic programs that are 
at present under consideration there will be 
increased demand for samples that will provide 
reliable national and local totals and indexes 
of agricultural production, population shifts, 
consumer preference, employment, unemploy- 
ment, rent, sales, inventories (as of shoes and 
tires), vacancy and occupancy, opinions, and 
the like, at low cost and with quick timing. These
requirements call for a mixture of the high- 
est grade of experience and research in sam- 
pling theory, with administrative ability and 
knowledge of various lines of subject-matter. 


Professor Hotelling, in his article in the
April issue of the Biometrics Bulletin, pointed
out the need for increased specialization in statistical
work and its teaching. His contention
must be supported and his recommendations 
put into action; otherwise the disparity be- 
tween the supply and demand for competent 
statisticians will widen to the point where the 
rest of the world will learn to thrive without 
statistics, or on a mediocre product. The need 
of specialization is made imperative by the in- 
creasingly wide variety of projects which sta- 
tisticians must be prepared to work in, owing 
largely to the splendid statistical work that has 
recently been done in the War Department, 
Navy, N. D. R. C., industry and government 
civilian agencies. The statistician will henceforth
find himself increasingly indispensable in
all phases of industry and government, from
the purchase of materials, studies of consumer 
preferences, consumer standards, design of 
product, control of quality in manufacturing, 
purchasing, and selling; also in personnel ad- 
ministration, polling, and social and economic 
surveys of all kinds so necessary in the success- 
ful administration of government or business. 
A statistician in the government service must 
be prepared to work on the design of samples 
for population or agricultural characteristics, 
inventories, rent, vacancy, number of stores, ex- 
penditures, number of workers, cost of handling 
furs or manufacturing shoes, or a particular 
price line of coats. Statisticians in industry 
and Agricultural Experiment Stations could 
likely match this list with an equally varied 
one, but the point is that the statistician must 
be a specialist in his own right, and be trans- 
ferable over a wide variety of applications. In 
the Graduate School the Department of Sta- 
tistics has always been separate and has never 
been overshadowed by any other department. 
The mathematics courses there are primarily 
directed toward providing prerequisites for 
more advanced work in statistics, with under- 
standable exceptions as where a group has 
asked for a course in hydrodynamics or thermo- 
dynamics. 


In the study of statistics in the Grad- 
uate School, emphasis is placed on the needs 
of the federal service. The word “efficiency”
in the design of samples for government uses
refers to the amount of information (as meas- 
ured by the coefficient of variation) furnished 
per unit cost, not per case. Moreover, in com- 
puting costs it is necessary to take into con- 
sideration not only the field and office costs 
of collecting and tabulating the data but the 
cost represented as the burden of response on 
the individuals or firms that furnish the in- 
formation. There have been many instances in 
which a decision was made to take a sample 
less than 100 percent when the cost of so 
doing, to the government, was obviously going 
to be equal to or greater than the cost of com- 
plete count, the motive being to lessen the bur- 
den of response on industry. 
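The distinction between efficiency per case and efficiency per unit cost can be shown with a small calculation. The sketch assumes simple random sampling from a large population, so that the coefficient of variation of the estimated mean is the per-case coefficient of variation divided by the square root of the sample size; the two designs, their costs, and the budget are hypothetical, and cost here could be taken to include an allowance for the burden of response.

```python
import math

def cv_for_budget(unit_cv, cost_per_case, budget):
    """CV of the estimated mean when the whole budget is spent on cases."""
    n = budget // cost_per_case
    return unit_cv / math.sqrt(n)

# Hypothetical designs: B is noisier per case but much cheaper per case.
budget = 10_000
cv_a = cv_for_budget(unit_cv=0.80, cost_per_case=10, budget=budget)  # 1,000 cases
cv_b = cv_for_budget(unit_cv=1.00, cost_per_case=4, budget=budget)   # 2,500 cases
print(round(cv_a, 4), round(cv_b, 4))
```

Design A yields more information per case, yet for the same outlay design B gives the smaller coefficient of variation, which is the sense of "efficiency" described above.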


The point of view in building the statis- 
tical curriculum is that while thorough train- 
ing in mathematical statistics is the prime 
requisite of a statistician, he is nevertheless 
expected to perform in other capacities as 
well. For instance, he must know a great 
deal about questionnaires, and field and 
office instructions, and must know the subject- 
matter of the enquiry well enough to say what 
data will answer which questions, and what 
data can be collected. He must know whether 
to use a complete count or a sample short of 
a complete count or some other type of partial 
investigation, for all of which he must have 
pretty clear ideas on the cost and time re- 
quired to carry out jobs of various complexities. 
He must look ahead and foresee the need of 
data, thus to recommend statistical programs 
ahead of time and not get caught short with- 
out data that should have been collected last 
quarter. Equally, he is expected to be strong 
in the analysis of good data and bad, and 
should be ready to translate his analyses 
into recommendations for action, tersely and 
forcefully, in simple language for administra- 
tive use (a rare ability indeed). The in- 
structors face these problems daily in their 
work and do not find it difficult to stress 
them. Some of these special problems call
for special courses, examples being Likert
and Cannell’s “Techniques of Interviewing”
and Jaffe’s “Planning of Statistical Enquiries.”


The sampling courses offered in the Graduate
School represent at present about the
only opportunities for learning the new developments
in areal and cluster sampling, and
their impact on business surveys, much of
which has not yet been published. Some of
the sampling courses apparently have almost 
identical titles and content, but are given by 
different instructors. The aim is to provide 
the opportunity of getting acquainted with 
the Washington specialists and their contributions.


The catalog of the School (obtainable 
on request to the Director or the Chairman 
named below) lists a wide variety of courses 
in many phases of statistical technology and 
administration. The curriculum undergoes 
continual revision with the aim of providing 
courses that will supply needs of the future. 


Special stress is placed on the courses
in statistics of under-graduate caliber, wherein
the students are largely people who have had
considerable training in economics, sociology, 
agricultural science, or some other field. The 
maturity of the classes calls for instruction 
of the highest leadership. In the graduate 
levels most of the students are engaged in
statistical work of some kind in various fields:
designing samples, working on interviewing
and office instructions, questionnaires, analyses
and adjustments, and writing reports and
recommendations.


The Seminars in “Sampling” and “Statis- 
tical Inference” are held approximately every 
three weeks and are open to all graduate 
students and to others upon application. They 
are attended by most of the leading mathe- 
matically inclined statisticians and economists 
in Washington, the average attendance being 
about 60. Statisticians from outside of Wash- 
ington often appear on the programs. It was 
the practice before the war to hold special 
series of lectures by outside speakers: Fisher, 
Shewhart, Neyman, Wishart, Yates and Coch- 
ran have appeared on these programs. 


The Department Committee on Mathematics 
and Statistics is composed of O. C. Stine 
(Bureau of Agricultural Economics), P. M. 
Hauser (Census), B. Ralph Stauber (War
Relocation Authority), M. A. Girshick
(Bureau of Agricultural Economics and
Columbia University), with W. Edwards Dem- 
ing (Bureau of the Budget) as Chairman. 


NEWS AND NOTES 


In a recent letter, MARIANNE E. BERNSTEIN
wrote: “When reading the very interesting
article by Dr. Philip Levine on human
blood groups in the current issue of the Biometrics
Bulletin I noticed that he refers to
my father Dr. FELIX BERNSTEIN as ‘A mathematician
who never carried out any blood
tests’. I was a little girl with long pigtails 
when my father was doing the research on 
the AB blood groups, and both my brother 
and I had our ears pricked by our father; and 
he then examined our blood in a laboratory 
installed in the Institute of Mathematical 
Statistics at the University of Gottingen. In- 
cidentally, the entire family turned out to 
have blood of the ‘O’ type.” 


A copy of the above letter was sent to 
Dr. PHILIP LEVINE and a portion of his
reply to “This very delightful controversy” 
follows: “The late Dr. Landsteiner, with 
whom I was associated from 1925-1932, fre- 
quently mentioned that Dr. Bernstein, an 
outstanding mathematician, was able to arrive 
at the accepted theory of the heredity of the 
four blood groups without doing any blood 
group determinations. We had silently as- 
sumed that this was the case. I am in 
error, I hasten to apologize, both to Dr. Bern- 
stein and his daughter. It seems to me how- 
ever, that essentially, it was not necessary 
for Dr. Bernstein to do any blood tests him- 
self. It is a great tribute to him that direct 
application of his special gifts to a series 
of figures—representing the varying incidence 
of the four blood groups among several races 
studied by other workers, particularly Hirsch- 
feld—resulted in his presentation of the triple 
allelomorphic theory of the heredity of the 
four blood groups.” 

A Nutrition Work Conference, the third 
in a series of conferences being conducted 
for Southern research investigators in various 
fields, was held in Raleigh, May 14-19, spon- 
sored by the General Education Board and 
the Institute of Statistics. The program in- 
cluded intensive training in statistics along 
with discussions and consultations on individual
research problems. R. COMSTOCK,
GERTRUDE M. COX, J. A. RIGNEY, W. A. HENDRICKS,
SARAH PORTER and JEANNE FREEMAN,
members of the staff of the Institute of
Statistics, gave lectures and led discussions
on such topics as the uses of statistics in
experimentation, available experimental designs
and their applicability to nutrition research,
summarizing data, the proper error
term, and estimation of work necessary to
obtain results with a specified degree of
accuracy. In attendance were GEORGIAN
ADAMS, U.S.D.A. Office of Experiment Stations;
RUTH BOYDEN, University of Kentucky;
ROBERT CAROLUS, Virginia Truck
Experiment Station; MARY DODDS, University
of Tennessee; J. F. EHEART, Virginia Agricultural
Experiment Station; PETER HENZE, Vegetable
Breeding Laboratory at Charleston;
MARTHA HOLLINGER, Louisiana State University;
HAROLD M. HYRE, West Virginia University;
BYRON E. JANES, University of Florida;
SOPHIE MARCUSE, Bureau of Nutrition and
Home Economics; LAVERNE McWHIRTER, Mississippi
State College; RUSSEL MILLER, Pennsylvania
State College; W. J. PETERSON, North
Carolina State College; RUTH REDER, Oklahoma
Agricultural Experiment Station; F. W.
SHERWOOD, North Carolina State College;
OLIVE SHEETS, Mississippi State College;
MARY SPEIRS, Georgia Agricultural Experiment
Station; and JESSIE WHITACRE, Texas Agricultural
Experiment Station.


JOHN WISHART wants to clear up his whereabouts
which was incorrectly reported in the
April Bulletin . . . He is with the Admiralty,
heading an administrative statistical
group.

COLONEL JOSEPH BERKSON, on leave of absence
from the Mayo Clinic, has been in the
Army since May 1942 in his present position
as Chief of the Statistics Division of the
Office of the Air Surgeon. He says that
he is supposed to be responsible for anything
that pertains to medical or biological statistics
in the Army Air Forces. That birthday
in May put him over the draft limit so
maybe we can get him back into civilian
life and on the editorial committee of the
Bulletin . . . ROBERT P. GAGE is carrying on
in the Department of Biometry and Medical
Statistics at the Mayo Clinic. His son Billy
and daughter Roberta are keeping him busy
at home since help is at a premium. He says,
“We are living at a fast pace” . . . LT. J. A.
GREENWOOD, formerly of Duke University, is
now stationed at the U. S. Naval Air Station,
Patuxent River, Md., the testing station for
the Bureau of Aeronautics. He is statistician
at the armament test unit, and states, “I am
having fine co-operation in the introduction
of statistical design and evaluation in their
tests.” Wonder if his experience with extrasensory
perception helps with the armament
tests . . . ALAN E. TRELOAR was found at
Fort Preble, Maine, working on the “wet-cold
field trials”, but he is back at the Climatic
Research Laboratory, Lawrence, Massachusetts,
formulating reports. He hopes to return to
the University of Minnesota about June 15.
If you succeed let us know how one gets a
release from the Statistical Research Group . . .
Perhaps, after June 17, a change in age will
bring a change in heart to S. S. WILKS and
then maybe he will release R. L. ANDERSON,
who is needed back on his job at North
Carolina State College . . . CAPTAIN CARL
E. MARSHALL of the Department of Mathematics,
Oklahoma A & M College is at the
Advanced Bombardier School, Big Spring, 
Texas . . . Have you noticed the letter heading 
“Columbia University, Division of War Re- 
search, Statistical Research Group” with 
HAROLD HOTELLING as official investigator and
W. ALLEN WALLIS as director of research? 
We have been promised a fairly full statement 
about this group after the war ... ABRAHAM
WALD has been promoted to a professorship
of mathematical statistics at Columbia University
... P. L. HSU of Kunming, China,
will lecture on advanced multivariate analysis 
and on other fields in mathematical statistics 
at Columbia during the semester beginning 
on January 31, 1946 ... JOSE CALZADA, agronomist
and statistician from the experiment
station at Lima, Peru, is studying statistics 
at Iowa State College. He is one of the 
Rockefeller Foundation Fellows and expects 
to travel extensively this summer visiting 
experiment stations in the United States ... W.
G. COCHRAN, Civilian U.S.S.B.S., APO 413,
c/o Postmaster, New York, writes, “London
distressed me a little at first, but one soon 
gets used to the missing houses, the sub- 
stitution of boards for windows, and the 
general air of dilapidation. The vivid green 
in the fields and parks is a heartening
sight.” He is working long hours and keep- 
ing the information in his head since there 
is no paper available. He witnessed the 
impressive two-day V-E celebration in London 
with the lights ablaze, the people singing, 
and the flags waving. More recently he 
wrote from Germany that he is getting some- 
what inured to the constant scenes of destruc- 
tion. He and his group are billeted in 
tastefully furnished homes of former Nazi 
officials. The German people are making 
their trek back to the demolished towns. He 
wrote that the hull of the Cologne Cathedral 
was standing, and the two spires were still 
an impressive landmark, but there had been 
considerable gutting inside which would re- 
quire long and careful reconditioning. How 
about celebrating July 15 in the United 
States? ... SAMUEL SIMONOVITZ has been
transferred from the Office of Milk Adminis- 
trator to the State Development Commission 
in Hartford, Connecticut, where he will be 
concerned with the correlation of information 


from state and federal agencies to provide a 
statistical basis for the administrative policies 
of the Commission ... LOWELL J. REED at
Johns Hopkins University finds much of his 
time taken up with war work in addition 
to his ordinary duties. We can hear others 
saying “me too!” ... R. PARKER, Citrus
Experiment Station, Riverside, Calif., says,
“I have had to shelve, I hope temporarily, the
work on the incorporation of regression in
the analysis of variance of yields of long-term
experiments. A very urgent problem
concerning the ‘quick decline’ of orange 
trees is being undertaken. Similar troubles 
with sour orange rooting trees have been 
observed in South Africa, South America 
and Java. I am attempting to put in force 
this spring a 5 x 5 Latin square experiment 
on effects of cultivation in citrus orchards.”
His border rows of grapefruit trees produce
excellent fruits! ... G. A. BAKER, Division
of Mathematics and Physics, says they have 
a large group of men there at Davis, Calif., 
who have studied statistics to a certain extent 
and who use it constantly in their research 
work ... H. L. LUCAS, research associate,
Department of Poultry Husbandry at Cornell 
University, has been conducting an informal 
class on experimental designs, analysis of 
data and interpretation of results ... L. N.
HAZEL with the Bureau of Animal Industry
located at Dubois, Idaho, sent in some good
suggestions about this bulletin. Tell us what
you want included in future issues ... A.
POPE with the Plant Industry Section at
Beltsville writes that he feels the section 
on Queries should be especially useful as a 
medium for obtaining answers on questions 
which frequently arise in the course of re- 
search work. Send G. W. SNEDECOR your 
questions. So far, he leads with “fan mail” 
... WALTER H. MEYER has recently been
promoted to Professor of Forestry at the Yale 
University School of Forestry, where he gives 
courses in statistical methods, forest manage- 
ment, forest economics and related fields. 
Professor Meyer is President of the Connect- 
icut Chapter of the American Statistical 
Association . . . The chairman of the Editorial 
Committee refused to accept reports on the 
recent trip of GERTRUDE COX to New York
City and New Haven, Conn. 


QUERIES 


QUERY A problem that has bothered me is
the fitting of regression lines when their position
is restricted in some way. For example,
suppose a test is made of the relationship between
the number of fish in a body of water
and the average number which can be caught
out of it, with a standard amount of fishing.
In fitting a regression line to such data, we
know that the point (0,0) must fall on the
line, since if no fish are present certainly none
will be caught. In other words, we have one
point which is free from sampling error. The
unique importance of this point will, it seems
to me, make observations in its neighborhood
of relatively less importance than observations
at a distance from it, where there is no fixed
guide-post. Do you know of any treatment
of situations of this sort, by which the best
straight (or curved) line could be fitted to
data when there is one point which must be
satisfied? The standard deviation from regression
(“standard error of estimate”) and
the standard error of the regression coefficient
would also be valuable. Or are these concepts
pertinent in such a situation?


ANSWER Deming gives both a general method 
and some particular solutions of your problem 
(Statistical Adjustment of Data. John Wiley 
and Sons, Inc., pages 30-34). Snedecor opens 
his chapter 6 with an illustration of the simple 
case in which X is measured without error 
and the variance of Y is constant for all values 
of X (Statistical Methods. Collegiate Press, 
Inc., Ames, Iowa). 


Observations in the neighborhood of (0,0)
may or may not be of less importance than
those at greater distances; it depends upon the
variance of Y. One often finds that this variance
increases with X. In fact, there are
many situations in which it seems reasonable
to suppose that in the sampled population the
standard deviation of Y is directly proportional
to X. If you think this hypothesis is suitable
in your fishing, the appropriate method is to
calculate the ratios,

$$\frac{Y}{X} = \frac{\text{number of fish caught}}{\text{total number of fish}},$$

then apply to them the statistical procedures
suitable for a single variate.

GEORGE W. SNEDECOR.
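The two cases mentioned in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the four data points are hypothetical, not taken from the query. The first function fits a line through (0,0) when the variance of Y is constant; the second treats the ratios Y/X as a single variate, which is the appropriate course when the standard deviation of Y is proportional to X.

```python
import math
from statistics import mean, stdev


def fit_constant_variance(x, y):
    """Line through (0,0) when the variance of Y is the same at every X.

    Returns (slope, standard deviation from regression, standard error of slope).
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b = sxy / sxx
    # Only one constant (the slope) is fitted, so n - 1 degrees of freedom remain.
    resid_ss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(resid_ss / (len(x) - 1))
    return b, s, s / math.sqrt(sxx)


def fit_sd_proportional_to_x(x, y):
    """Line through (0,0) when the standard deviation of Y is proportional to X.

    The ratios Y/X are handled as a single variate: the slope is their mean,
    and its standard error is the standard error of that mean.
    """
    ratios = [yi / xi for xi, yi in zip(x, y)]
    b = mean(ratios)
    s = stdev(ratios)
    return b, s, s / math.sqrt(len(ratios))


# Hypothetical figures, for illustration only.
fish_present = [100, 200, 400, 800]
fish_caught = [11, 19, 42, 78]

print(fit_constant_variance(fish_present, fish_caught))
print(fit_sd_proportional_to_x(fish_present, fish_caught))
```

The two slopes differ because the first method gives most weight to the observations with large X, while the second weights every ratio equally.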


QUERY The data in the table were analyzed 
by the method of fitting constants, the (in- 
complete) results being shown at the end. 
Are we justified in using this or any other 
statistical approach on the whole body of data 
when the amount of missing information is 
as great as it is here? What are the degrees 
of freedom for interaction? 


TABLE 1. CAROTENE IN BLOOD PLASMA OF CATTLE.
Micrograms per 100 ml. blood plasma (oxalated).

Month of        No. animals (n)    Age of animals (months)
observation     and mean (x)       1-3    4-6    7-9    10-12
August n 5 8 6 
x 206.6 379.1 517.2 
June n 8 6 5 
x 192.1 478.0 456.6 
April n 5 6 
x 93.6 205.2 
March n 9 2 
x 105.6 189.5 
February n 7 
x 120.3 
December n 4 
x 61.0 
October n 1 1 3 7 
x 55.0 233.0 289.7 282.4 
September n 2 2 6 6 
x 75.0 263.5 254.0 231.2 


Analysis of variance 


Source of variation              Degrees of    Sum of      Mean
                                 freedom       squares     square
Months                           7             569,737     81,398
Ages                             3             660,602     220,201
Interaction                      ?             104,541
Individuals within subclasses    79            464,994     5,886


ANSWER Taking the second question first: the
degrees of freedom for interaction are
(7)(3) − number of vacant subclasses;
that is, 21 − 12 = 9. Hence, the mean square for
interaction is 104,541/9 = 11,616 and F is
11,616/5,886 = 1.97, just at the 5% point.


In the method which you have used for fit- 
ting constants the hypothesis is set up that in 
the sampled population the interaction is neg- 
ligible (F. Yates. Jour. Agri. Sci. 23:108; and 
Emp. Jour. Exp. Agri. 1:129). The test of 
significance indicates rejection of that hy- 
pothesis. But examination of the data reveals 
rather orderly progressions of the means with 
increasing age except in August when the 
mean for 7-9 went on up instead of following 
the pattern. Since there are 6 cattle in this 
subclass, it may account for a considerable por- 
tion of the interaction. This should be con- 
sidered before rejecting the hypothesis of neg- 
ligible interaction. If it should be found that 
these 6 animals were treated differently from 
others of the same age (kept longer on pas- 
turage, for example), decision about the hy- 
pothesis might be affected. On the other hand, 
even if there is some interaction in the pop- 
ulation, your estimates of mean square will 
not be greatly in error, and your decisions 
about the effects of month and age will not 
be changed. 
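The degrees of freedom and the F ratio quoted at the start of this answer can be checked directly from the subclass numbers in Table 1. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the counts are listed per month in the order printed, without assigning them to particular age classes.

```python
# Number of animals in each occupied subclass, by month, as read from Table 1.
cell_counts = {
    "August": [5, 8, 6],
    "June": [8, 6, 5],
    "April": [5, 6],
    "March": [9, 2],
    "February": [7],
    "December": [4],
    "October": [1, 1, 3, 7],
    "September": [2, 2, 6, 6],
}
N_AGE_CLASSES = 4

n_months = len(cell_counts)
occupied = sum(len(counts) for counts in cell_counts.values())
vacant = n_months * N_AGE_CLASSES - occupied

# Interaction: (months - 1)(ages - 1) less the number of vacant subclasses.
df_interaction = (n_months - 1) * (N_AGE_CLASSES - 1) - vacant

# Within subclasses: total animals less the number of occupied subclasses.
n_animals = sum(sum(counts) for counts in cell_counts.values())
df_within = n_animals - occupied

ms_interaction = 104_541 / df_interaction
ms_within = 464_994 / df_within
f_ratio = ms_interaction / ms_within

print(vacant, df_interaction, df_within)
print(round(ms_interaction), round(ms_within), round(f_ratio, 2))
```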


Returning to your first question: Of course, 
you do not have the information that might 
be expected in a complete 4 x 8 table, but 
the method used enables you to extract that 
which is available. I do not know the prob- 
lem which the experiment was designed to 
solve, hence cannot judge the appropriateness 
of the method used. I always feel that infor- 
mation may be lost by classifying a continu- 
ous variate like age in such broad intervals as 
yours. If you are interested in the trend of 
the carotene content, covariance might be 
more suitable. 


QUERY In your Queries Section of Biometrics
Bulletin Vol. 1, No. 1, p. 9 you say “. . . the
null hypothesis states that the real relation
between x and y is a straight line.” Is that
an acceptable null hypothesis? Shouldn’t the
null hypothesis be stated in negative terms?
For example: “the real relation between x and
y is not a straight line.”


Is it possible to give a brief clear exposition 
of the advantages that were gained by the in- 
troduction of the term “null hypothesis” into 
statistical literature? 


ANSWER In certain important tests of hy- 
pothesis, e. g., the t-test for the difference 
between two means, the hypothesis is often 
set up with the possibility of rejecting it at 
some designated significance level, and for 
this reason it is often called a “null hypothe- 
sis.” In such cases the “null” part of the 
phrase “null hypothesis” refers to our inter- 
est in a possible rejection rather than that the 
null hypothesis should be stated negatively. 


The phrase “null hypothesis” has come to 
be used by some statisticians in a much broader 
sense than the above. It has been used to des- 
ignate any eligible hypothesis. In this sense 
the “null hypothesis,” in order to be an eligible 
one, must be free from any vagueness in order 
to form the basis for the sampling distribution 
to be used in making tests of significance or 
in setting up confidence limits. The hypothe- 
sis “that the real relation between x and y is a 
straight line” is an eligible hypothesis in the 
sense that it is free from vagueness. It is, 
however, an hypothesis for testing a specifi- 
cation rather than an ordinary hypothesis in 
which the specification is assumed and tests 
are made of certain parameters estimated from 
the data. Such hypotheses may present diffi- 
cult mathematical problems. 


T. A. BANCROFT.


Editor’s note: It is doubtless possible to give 
a clear exposition of the testing of hypotheses, 
but it may not be brief. This part of the ques- 
tion will be put on the docket for an expository 
article. 


ABSTRACTS 


(11) 


BERKSON, Joseph, M.D. (Mayo Clinic). Appli- 
cation of Logistic Function to Bio-Assay. Jour. 
Amer. Stat. Assn. 39:357-365 Sept. 1944 


When the effect of a drug is measured as the
proportion of exposed individuals affected, the
effect plotted against dosage (or its logarithm)
frequently assumes the form of a symmetric
sigmoidal curve. From this curve the L.D. 50,
i.e., the dose which affects just 50 per cent of
those exposed, or any similar measure of dosage
can be estimated. The integral of the normal
curve has been used to represent the sigmoidal
function, and its application advanced extensively
by Bliss, with a method of linear transformation
as “probits.” This article advances the
use of the “logistic function” and a linear transformation
with the use of “logits.” The curves
are similar, but the logistic is analytically simpler
in a number of respects. The “logit” is simply
the logarithm of the ratio of p to q, where q is
the proportion affected and p its complement;
the first derivative which is utilized to determine
the weights to be used in the solution is proportional
to the simple product of p and q. Thus no
auxiliary tables other than one for logarithms are
required.

A method of fitting equivalent to that used
for “probits” can be applied to “logits.” In this
method an approximation is implicitly used

$$\chi^2 = \frac{Z^2}{PQ}\,(l-L)^2,$$

where Z is the first derivative with respect to the
linear transformation of the fitted function, P, Q
the fitted rates and l, L the linearly transformed
variates of the observed and fitted rates respectively.
For “probits” Z is the ordinate of the
normal curve; for “logits” Z is equal to PQ, so
for the logistic the approximation is simplified to

$$\chi^2 = PQ\,(l-L)^2.$$


If ‘corrected’ values are to be used to substitute 
for observed rates, the corrected logit is given by 


This method, both in the case of “probits” and
“logits,” requires one or more preliminary solutions
to obtain the values of PQ and Z.

A different method is advanced in this
article for the logistic, depending on a more precise
approximation

$$\chi^2 = \frac{zZ}{PQ}\,(l-L)^2,$$

where z corresponds to the observed and Z to the
fitted rate. For the logistic this simplifies to

$$\chi^2 = pq\,(l-L)^2.$$

The weights now involve only the observed rates,
and a final solution can be obtained at once,
without successive approximation. This moreover
should yield, for any given set of differences,
a lower χ². A trial for a number of previously
published series gave a lower χ² in these
instances for “logits” than for “probits.”
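A single-pass fit of the kind described, with weights that depend only on the observed rates, might be sketched in Python as follows. This is an illustration and not the article's own computing scheme: the function names and the dosage figures are hypothetical, p is taken here to be the observed proportion affected (the product pq is the same whichever of the two is called p), and dosage groups with no response or complete response are simply skipped because their logits are undefined.

```python
import math


def logit(p):
    """Natural logarithm of the ratio of p to its complement."""
    return math.log(p / (1.0 - p))


def fit_logit_line(doses, n_exposed, n_affected):
    """Weighted least-squares line, logit = intercept + slope * dose.

    Each group is weighted by n * p * q, computed from the observed rate,
    so no preliminary solution is needed.
    """
    xs, ls, ws = [], [], []
    for x, n, r in zip(doses, n_exposed, n_affected):
        p = r / n
        if p <= 0.0 or p >= 1.0:
            continue  # assumption of this sketch: drop groups at 0 or 100 per cent
        xs.append(x)
        ls.append(logit(p))
        ws.append(n * p * (1.0 - p))
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    l_bar = sum(w * l for w, l in zip(ws, ls)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxl = sum(w * (x - x_bar) * (l - l_bar) for w, x, l in zip(ws, xs, ls))
    slope = sxl / sxx
    return l_bar - slope * x_bar, slope


def dose_for_half(intercept, slope):
    """Dose at which the fitted logit is zero, i.e. the fitted rate is 50 per cent."""
    return -intercept / slope


# Hypothetical assay: log dose, number exposed, number affected.
log_dose = [0.0, 0.5, 1.0, 1.5, 2.0]
exposed = [20, 20, 20, 20, 20]
affected = [2, 6, 10, 15, 18]

a, b = fit_logit_line(log_dose, exposed, affected)
print(a, b, dose_for_half(a, b))
```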


(12) 
GOODWIN, Richard H. (Connecticut College). 
Estimates of the Minimum Numbers of Genes 


Solidago. Bulletin of the Torrey Botanical 
Club. 


Interspecific and intraspecific crosses in
Solidago have been made between various strains.
From analyses of the F1 and F2 populations, estimates
have been made of the minimal numbers of
genes differentiating these strains. Minimal estimates
for single character differences were calculated
from the following expression derived by
Professor Sewall Wright: $n = A_x^2 / 8(R_x^2 - 1)$,
where n equals the number of gene differences,
$A_x$ equals the difference between parental strains
in average value of character x divided by the
standard deviation of that character in the F1,
and $R_x$ equals the standard deviation of character
x in the F2 divided by the standard deviation
of character x in the F1. The maximal number of
genes affecting the expression of any two characters,
$n_c$, was obtained from the following formula
derived by Dr. Donald R. Charles:

$$n_x(1-2c) + n_c = \frac{A_x A_y\,(r_2 R_x R_y - r_1)}{8\,(R_x^2 - 1)(R_y^2 - 1)}$$

where $n_x$ equals the number of character-x loci
each linked with a character-y locus, c equals the
average crossover value between linked pairs of
genes, $A_y$ and $R_y$ equal character-y quotients
analogous to character-x quotients, and $r_1$ and $r_2$
equal the correlation coefficients between characters
x and y in the F1 and F2 respectively. In
making minimal estimates for the total number
of genes differentiating all the characters analyzed,
$n_c$ was taken as the nearest integer below
the exact value for the expression $n_x(1-2c) + n_c$.

Taxonomic distinctions in Solidago have 
been based upon morphological criteria. The 
number of genes involved in the control of some 
of the morphological differences in an inter- 
specific cross has been estimated to be at least 
twenty-one; in an intraspecific cross, between 
strains which have been considered subspecif- 
ically distinct, at least four. 

For practical reasons the systematist has 
taken little cognizance of physiological behavior 
in delimiting taxonomic categories. Yet in the 
two intraspecific crosses studied, at least nine or 
ten genes appear to be involved in differentiating 
the parental strains with respect to flowering 
time. In one case, the strains were morphologi- 
cally dissimilar and subspecifically distinct; in 
the other case, morphological differences were 
inadequate to warrant a taxonomic separation. 
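As a rough illustration of how the two expressions quoted in this abstract are evaluated, the Python sketch below computes the quotients A and R from summary statistics and then applies the formulas attributed to Wright and to Charles. The function names and the numerical inputs are hypothetical and are not data from the Solidago crosses.

```python
def quotients(mean_parent_1, mean_parent_2, sd_f1, sd_f2):
    """Return (A, R): parental difference and F2 spread, both in units of the F1 s.d."""
    a = abs(mean_parent_1 - mean_parent_2) / sd_f1
    r = sd_f2 / sd_f1
    return a, r


def wright_min_genes(a_x, r_x):
    """Minimal number of gene differences, n = A^2 / (8 (R^2 - 1))."""
    if r_x <= 1.0:
        raise ValueError("the F2 must be more variable than the F1 (R > 1)")
    return a_x ** 2 / (8.0 * (r_x ** 2 - 1.0))


def charles_shared_genes(a_x, r_x, a_y, r_y, r1, r2):
    """Value of n_x(1 - 2c) + n_c for two characters x and y.

    r1 and r2 are the correlations between x and y in the F1 and F2.
    """
    numerator = a_x * a_y * (r2 * r_x * r_y - r1)
    denominator = 8.0 * (r_x ** 2 - 1.0) * (r_y ** 2 - 1.0)
    return numerator / denominator


# Hypothetical summary figures for one character (days to flowering, say).
a_x, r_x = quotients(mean_parent_1=60.0, mean_parent_2=100.0, sd_f1=5.0, sd_f2=7.5)
print(a_x, r_x, wright_min_genes(a_x, r_x))
```

When the two characters are identical, so that both correlations equal one, the second expression reduces to the first, which is a convenient check on the arithmetic.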


(13) 
KNUDSEN, Lila F. (Food and Drug Administra- 
tion). Penicillin Assay. Sci. 101:46 Jan. 12, 1945. 


A simple statistical method is given for de- 
termining the potency of antibiotic substances, 
in terms of a suitable standard and error of assay, 
by means of a chart and a nomograph in con- 
junction with four numbers obtained by certain 
additions and subtractions of the diameters of 
the zones of inhibition of the incubated plates. 
Each plate has a low and a high dose of standard
having diameters of zone of inhibition $s_L$ and $s_H$
respectively, and a low and high dose of the unknown
having diameters $u_L$ and $u_H$. For each
plate two quantities are calculated:

$$v = (u_L + u_H) - (s_L + s_H)$$

and

$$w = (s_H + u_H) - (s_L + u_L).$$

The four quantities used in conjunction with the
chart and nomograph are $V = \Sigma v$, $W = \Sigma w$,
$R_v$ = range of v and $R_w$ = range of w. No further
calculations are necessary.

Although they apply specifically to a four- 
plate assay in which the ratio of high dose to 
low dose is 4:1, the chart and nomograph can be 
used for any number of plates and any ratio of 
doses by multiplying the result obtained by an 
appropriate tabled figure. 

The method can be used also in similarly 
designed assays of other drugs or vitamins where, 
for instance, differences between litters are to be 
removed. 
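The additions and subtractions described in this abstract are easily mechanized. The Python sketch below is an illustration only: the `Plate` type, the function names and the zone diameters are hypothetical, and the chart and nomograph that convert the four numbers into potency and error are not reproduced.

```python
from typing import NamedTuple, Sequence


class Plate(NamedTuple):
    """Zone-of-inhibition diameters on one plate (hypothetical units: mm)."""
    s_low: float   # standard, low dose
    s_high: float  # standard, high dose
    u_low: float   # unknown, low dose
    u_high: float  # unknown, high dose


def plate_quantities(p: Plate):
    """Return (v, w) for one plate."""
    v = (p.u_low + p.u_high) - (p.s_low + p.s_high)  # unknown minus standard
    w = (p.s_high + p.u_high) - (p.s_low + p.u_low)  # high dose minus low dose
    return v, w


def assay_summary(plates: Sequence[Plate]):
    """Return the four numbers V, W, R_v and R_w for a set of plates."""
    vs, ws = zip(*(plate_quantities(p) for p in plates))
    return {
        "V": sum(vs),
        "W": sum(ws),
        "R_v": max(vs) - min(vs),
        "R_w": max(ws) - min(ws),
    }


# Hypothetical four-plate assay.
plates = [
    Plate(18, 22, 17, 21),
    Plate(19, 23, 18, 23),
    Plate(18, 22, 18, 21),
    Plate(17, 22, 17, 21),
]
print(assay_summary(plates))
```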




Officers of the American Statistical Association:
President, Walter A. Shewhart; Directors,
Henry B. Arthur, C. I. Bliss, Simon
Kuznets, E. Grosvenor Plowman, Willard L.
Thorp, and Helen M. Walker; Vice-Presidents,
William G. Cochran, A. D. H. Kaplan, Lowell
J. Reed; Secretary-Treasurer, Lester S. Kellogg.


Officers of the Biometrics Section: C. I. Bliss, 
Chairman; H. W. Norton, Secretary. 


Editorial Committee for the BIOMETRICS 




(14)
KNUDSEN, Lila F. (Food and Drug Administration).
Control Chart Analysis of Penicillin Assays.
Jour. of Bact. Aug. 1945.


Control Chart Analysis can be applied to the
penicillin assay abstracted above by setting up
control charts for W, $R_w$ and $R_v$. W is a constant
function of the slope of the dosage response
curve. $R_w$ is a measure of the variation of these
slopes from plate to plate and $R_v$ is a measure of
the between-plate variation of the vertical distance
between the two parallel dosage response
curves fitted to the data from each plate.
Averages and three-sigma control limits on W,
$R_w$ and $R_v$ are calculated from data gathered at
one laboratory over a few weeks. A separate
control chart is made for each of the three variables
with assay number (or time) as the abscissa
and the variable (W, $R_w$, or $R_v$ as the case
may be) as the ordinate. The averages and control
limits are plotted on the charts. As each
assay is run the values of W, $R_w$ and $R_v$ are
plotted. If the plotted point is outside the control
limits, trouble is indicated, possibly in the form
of contamination or a leaky cup. After the
trouble is located the assay is repeated.
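The charting routine described here can be sketched in Python as follows. This is a simplified illustration, not the procedure of the paper: it assumes the three-sigma limits are taken as the average plus and minus three sample standard deviations of the baseline values, whereas standard control-chart practice for ranges uses tabulated factors. The function names and the baseline figures are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline):
    """Return (lower limit, average, upper limit) from baseline assays.

    Assumption of this sketch: limits are the average +/- 3 sample
    standard deviations of the baseline values.
    """
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - 3.0 * sigma, centre, centre + 3.0 * sigma


def out_of_control(value, limits):
    """True if a newly plotted value falls outside the control limits."""
    lower, _, upper = limits
    return value < lower or value > upper


# Hypothetical baseline values of W gathered over a few weeks.
baseline_w = [32, 34, 33, 31, 35]
limits_w = control_limits(baseline_w)

for new_w in (33, 40):
    status = "trouble indicated" if out_of_control(new_w, limits_w) else "in control"
    print(new_w, status)
```

The same two functions would be applied separately to the series of $R_w$ and $R_v$ values, one chart per variable.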


BULLETIN: Gertrude M. Cox, Chairman, W.
G. Cochran, F. R. Immer, J. Neyman, H. W.
Norton, L. J. Reed, G. W. Snedecor, Sewall
Wright.


Material for the BULLETIN should be
addressed to the Chairman of the Editorial
Committee, Institute of Statistics, North Carolina
State College, Raleigh, N. C.; material
for Queries should go to “Queries”, Statistical
Laboratory, Iowa State College, Ames,
Iowa, or to any member of the committee.


